set.seed(15)
library(ggplot2)
library(reshape2)
The observations along the tusk are denoted \(Y_i\) for \(i=1, \ldots, n\), with the corresponding position along the tusk denoted \(x_i\). The model is the following \[Y_i = f(x_i, \varphi)+\varepsilon_i \] with \(\varepsilon_i\) a random noise assumed to be normally distributed with mean \(0\) and variance \(\omega^2\). The regression function is a periodic sinusoidal function \[f(x, \varphi) = A \sin(g(x)+b) + B\sin(2g(x)+2b+\pi/2) \] with function \(g\) defined as \[ g(x) = ax+\xi_x\] and finally \(\xi_x\) is assumed to be a random Ornstein-Uhlenbeck process \[d\xi_x = -\beta \xi_xdx+\sigma dW_x \]
The objective is to estimate the parameters \(\varphi = (A,B,a,b)\); \(\psi = e^{\beta\Delta}\) where \(\Delta\) is the step size between two observations and \(\gamma^2 = \frac{\sigma^2}{2\beta}(1-e^{-2\beta\Delta})\).
Let us start with some simulations.
source("../../R/2_tusk/utils/simulation.R")
xi <- rxi()
Y <- f(xi) + rnorm(n, 0, omega)
plot(A * sin(g(rep(0, n), a) + b), type = "l")
plot(B * sin(2 * g(rep(0, n), a) + 2 * b + pi / 2), type = "l")
plot(f(rep(0, n)), type = "l")
plot(xi, type = "l")
plot(Y, type = "l")
The EM algorithm is based on the complete log-likelihood of the model. The solution of the hidden process \(\xi_x\) is \[\xi_{x+\Delta} = \xi_x e^{-\beta\Delta} + \int_{x}^{x+\Delta} \sigma e^{\beta(s-x)}dW_s \] such that the transition density is \[p(\xi_{x+\delta}|\xi_x) = \mathcal{N}(\xi_x e^{-\beta\Delta}, \frac{\sigma^2}{2\beta}(1-e^{-2\beta\Delta}))\]
The complete log-likelihood is thus \[\begin{eqnarray*} \log L(Y,\xi,\theta)&=&\sum_{i=1}^n\log p(Y_i|\xi_i) +\sum_{i=1}^n\log p(\xi_{i}|\xi_{i-1}) +\log p(\xi_1)\\ &=& -\sum_{i=1}^n \frac{(Y_i-f(x_i, \varphi))^2}{2\omega^2}-\frac{n}{2}\log(\omega^2)\\ &&- \sum_{i=1}^n\frac{(\xi_i-\xi_{i-1}e^{-\beta\Delta})^2}{\frac{\sigma^2}{\beta}(1-e^{-2\beta\Delta})} - \frac{n}{2}\log(\frac{\sigma^2}{2\beta}(1-e^{-2\beta\Delta})) \\ &=& -\sum_{i=1}^n \frac{(Y_i-f(x_i, \varphi))^2}{2\omega^2}-\frac{n}{2}\log(\omega^2)\\ &&- \sum_{i=1}^n\frac{(\xi_i-\xi_{i-1}\psi)^2}{2\gamma^2} - \frac{n}{2}\log(\gamma^2) \end{eqnarray*}\]
The EM algorithm proceeds at iteration \(k\) with the two following steps, given the current value of the parameters \(\theta^k\)
The condition distribution \(p(\xi|Y; \theta^k)\) is not explicit because the regression function is not linear. We should proceed with a MCMC algorithm to sample from this distribution. This will lead to a stochastic version of the EM algorithm, namely the SAEM algorithm.
We need the sufficient statistics to update the algorithm.
The statistics are
\[\begin{eqnarray*} S_1(\xi_i)& =& \frac{1}{n}\sum_{i=1}^n (Y_i -f(x_i(\xi_i), \varphi))^2\\ S_2(\xi_i) &=& \sum_{i=1}^n \xi_{i-1}\xi_i\\ S_3(\xi_i) &=& \sum_{i=1}^n \xi_{i-1}^2\\ S_4(\xi_i) &=& \sum_{i=1}^n \xi_i^2 \end{eqnarray*}\]
The update of the parameters are based on these statistics.
The steps of the SAEM algorithm are
E step: simulation of a new trajectory \(\xi^k\) with a MCMC algorithm targeting \(p(\xi|Y; \theta^k)\) as stationary distribution
SA step: stochastic approximation of the sufficient statistics
\[\begin{eqnarray*} s_1^k &=& s_1^{k-1} + (1-\alpha_k)(S_1(\xi^k)-s_1^{k-1}) \\ s_2^k &=& s_2^{k-1} + (1-\alpha_k)(S_2(\xi^k)-s_2^{k-1}) \\ s_3^k &=& s_3^{k-1} + (1-\alpha_k)(S_3(\xi^k)-s_3^{k-1}) \\ s_4^k &=& s_4^{k-1} + (1-\alpha_k)(S_4(\xi^k)-s_4^{k-1}) \\ \end{eqnarray*}\]
\[\begin{eqnarray*} \widehat{\varphi}^k &=& \arg\min_\varphi \sum_{i=1}^n \left(y_i - f(x_i(\xi_i^k), \varphi)\right)^2 \\ \widehat{\psi}^k &=& \frac{s_2^k}{s_3^k}\\ \widehat{\omega^{2}}^k &=& s_1^k\\ \widehat{\gamma^{2}}^k &=& \frac{1}{n}(\widehat{\psi^2}^ks_3^k-2\widehat{\psi}^ks_2^k+s_4^k) \end{eqnarray*}\]
source("../../R/2_tusk/utils/MCMC.R")
M <- 150
mcmc.obj <- mcmc.alg(Y, M)
Tirage
plot(xi, type = "l", col = "blue",
ylim = c(min(min(mcmc.obj$xi.c), min(xi)), max(max(mcmc.obj$xi.c), max(xi))))
lines(mcmc.obj$xi.c, type = "l", col = "red")
plot(Y, type = "l", col = "blue")
lines(f(mcmc.obj$xi.c), type = "l", col = "red")
plot(mcmc.obj$acceptance.rate[mcmc.obj$n.it,])
abline(h = acceptance.target, lty = "dashed", col = "red")
Ecarts
worst.idx <- which.max(abs(xi - mcmc.obj$xi.c))
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
geom_line(aes(y = mcmc.obj$xi[, worst.idx]), color = "blue") +
geom_line(aes(y = mcmc.obj$xi.p[, worst.idx]), color = "red") +
geom_hline(aes(yintercept = xi[[worst.idx]])) +
ylab("Y")
best.idx <- which.min(abs(xi - mcmc.obj$xi.c))
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
geom_line(aes(y = mcmc.obj$xi[, best.idx]), color = "blue") +
geom_line(aes(y = mcmc.obj$xi.p[, best.idx]), color = "red") +
geom_hline(aes(yintercept = xi[[best.idx]])) +
ylab("Y")
farest.idx <- which.max(abs(xi))
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
geom_line(aes(y = mcmc.obj$xi[, farest.idx]), color = "blue") +
geom_line(aes(y = mcmc.obj$xi.p[, farest.idx]), color = "red") +
geom_hline(aes(yintercept = xi[[farest.idx]])) +
ylab("Y")
coeff <- max(mcmc.obj$acceptance.rate[, worst.idx]) / max(mcmc.obj$delta[, worst.idx])
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
geom_line(aes(y = mcmc.obj$acceptance.rate[, worst.idx]), color = "red") +
geom_line(aes(y = mcmc.obj$delta[, worst.idx] * coeff), color = "blue") +
geom_hline(aes(yintercept = acceptance.target * 1.1), linetype = 2, color = "red") +
geom_hline(aes(yintercept = acceptance.target * .9), linetype = 2, color = "red") +
scale_y_continuous(name = "Acceptance rate", sec.axis = sec_axis(~. / coeff, name = "Delta"))
coeff <- max(mcmc.obj$acceptance.rate[, best.idx]) / max(mcmc.obj$delta[, best.idx])
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
geom_line(aes(y = mcmc.obj$acceptance.rate[, best.idx]), color = "red") +
geom_line(aes(y = mcmc.obj$delta[, best.idx] * coeff), color = "blue") +
geom_hline(aes(yintercept = acceptance.target * 1.1), linetype = 2, color = "red") +
geom_hline(aes(yintercept = acceptance.target * .9), linetype = 2, color = "red") +
scale_y_continuous(name = "Acceptance rate", sec.axis = sec_axis(~. / coeff, name = "Delta"))
xi au fil des k
par(mfrow = c(2, 2))
for (k.ind in round(seq(1, M, length.out = 10))) {
plot(xi, type = "l", col = "blue",
ylim = c(min(min(mcmc.obj$xi[k.ind,]), min(xi)), max(max(mcmc.obj$xi[k.ind,]), max(xi))),
main = paste("k=", k.ind))
lines(mcmc.obj$xi[k.ind,], type = "l", col = "red")
}
f au fil des k
par(mfrow = c(2, 2))
for (k.ind in round(seq(1, M, length.out = 10))) {
plot(Y, type = "l", col = "blue",
main = paste("k=", k.ind))
lines(f(mcmc.obj$xi[k.ind,]), type = "l", col = "red")
}
Identifiabilité
p1 <- rxi()
p2 <- rxi()
p3 <- rxi()
p4 <- rxi()
par(mfrow = c(2, 2))
plot(p1, type = "l", col = 1)
plot(p1, type = "l", col = 1, ylim = c(min(c(min(p1), min(p2))), max(c(max(p1), max(p2)))))
lines(p2, type = "l", col = 2)
plot(p1, type = "l", col = 1, ylim = c(min(c(min(p1), min(p3))), max(c(max(p1), max(p3)))))
lines(p3, type = "l", col = 3)
plot(p1, type = "l", col = 1, ylim = c(min(c(min(p1), min(p4))), max(c(max(p1), max(p4)))))
lines(p4, type = "l", col = 4)
par(mfrow = c(2, 2))
plot(f(p1), type = "l", col = 1)
plot(f(p1), type = "l", col = 1)
lines(f(p2), type = "l", col = 2)
plot(f(p1), type = "l", col = 1)
lines(f(p3), type = "l", col = 3)
plot(f(p1), type = "l", col = 1)
lines(f(p4), type = "l", col = 4)
par(mfrow = c(2, 2))
plot(Y, type = "l", col = 1, main = "goal")
plot(Y, type = "l", col = 1, main = "MCMC")
lines(f(mcmc.obj$xi.c), type = "l", col = 2)
plot(Y, type = "l", col = 1, main = "xi simulation")
lines(f(mcmc.obj$xi.c), type = "l", col = 2)
lines(f(p1), type = "l", col = 3)
plot(Y, type = "l", col = 1, main = "xi simulation")
lines(f(mcmc.obj$xi.c), type = "l", col = 2)
lines(f(p2), type = "l", col = 4)
source("../../R/2_tusk/utils/SAEM.R")
saem.obj <- saem.alg(Y)
plot(xi, type = "l", col = "blue",
ylim = c(min(min(saem.obj$mcmc$xi.c), min(xi)), max(max(saem.obj$mcmc$xi.c), max(xi))))
lines(saem.obj$mcmc$xi.c, type = "l", col = "red")
plot(Y, type = "l", col = "blue")
lines(f(saem.obj$mcmc$xi.c, A.arg = saem.obj$A.c, B.arg = saem.obj$B.c, a.arg = saem.obj$a.c, b.arg = saem.obj$b.c),
type = "l", col = "red")
plot(f(rep(0, n)), type = "l")
lines(f(rep(0, n), A.arg = saem.obj$A.c, B.arg = saem.obj$B.c, a.arg = saem.obj$a.c, b.arg = saem.obj$b.c),
type = "l", col = "red")
par(mfrow = c(2, 2))
plot(saem.obj$omega, type = 'l', col = 'blue', main = "Omega", ylim = c(min(c(saem.obj$omega, omega)), max(saem.obj$omega, omega)))
abline(h = omega, col = 'red')
plot(saem.obj$psi, type = 'l', col = 'blue', main = "Psi", ylim = c(min(c(saem.obj$psi, psi)), max(saem.obj$psi, psi)))
abline(h = psi, col = 'red')
plot(saem.obj$gamma, type = 'l', col = 'blue', main = "gamma", ylim = c(min(c(saem.obj$gamma, gamma)), max(saem.obj$gamma, gamma)))
abline(h = gamma, col = 'red')
plot(saem.obj$A, type = 'l', col = 'blue', main = "A", ylim = c(min(c(saem.obj$A, A)), max(saem.obj$A, A)))
abline(h = A, col = 'red')
plot(saem.obj$B, type = 'l', col = 'blue', main = "B", ylim = c(min(c(saem.obj$B, B)), max(saem.obj$B, B)))
abline(h = B, col = 'red')
plot(saem.obj$b, type = 'l', col = 'blue', main = "b", ylim = c(min(c(saem.obj$b, b)), max(saem.obj$b, b)))
abline(h = b, col = 'red')
plot(saem.obj$a, type = 'l', col = 'blue', main = "a", ylim = c(min(c(saem.obj$a, a)), max(saem.obj$a, a)))
abline(h = a, col = 'red')
Plan d’expérience
params.true <- c(omega = omega, psi = psi, gamma = gamma, A = A, B = B, a = a, b = b)
params.est <- readRDS("../../data/2_tusk/parameters.est.rds")
apply(params.est, 2, mean)
## omega psi gamma A B a
## 0.01784953 0.93182377 0.09719714 0.49804310 -0.24480915 0.10000059
## b
## 1.01001660
params.pe <- t(apply(params.est, 1, function (row) (params.true - row) / params.true) * 100)
params.mape <- apply(abs(params.pe), 2, mean)
params.se <- apply(params.est, 1, function (row) (params.true - row)^2)
params.rmse <- sqrt(apply(params.se, 1, mean))
df <- melt(params.pe, varnames = c("it", "param"))
ggplot(data = df) +
geom_boxplot(aes(x = param, y = value))
params.mape
## omega psi gamma A B a b
## 78.4953366 2.1691948 6.0136665 0.6686644 2.1343387 0.4716401 14.3315989
params.rmse
## omega psi gamma A B a
## 0.0084336743 0.0265938232 0.0078524878 0.0318171392 0.0062878441 0.0005930824
## b
## 0.1969258133